For N=1:

rank | frequency | n-gram |
---|---|---|
1 | 5307 | -y |
2 | 4773 | -r |
3 | 4730 | -a |
4 | 4395 | -ň |
5 | 4119 | -i |

For N=2:

rank | frequency | n-gram |
---|---|---|
1 | 2572 | -yň |
2 | 1932 | -ar |
3 | 1873 | -an |
4 | 1728 | -da |
5 | 1551 | -iň |

For N=3:

rank | frequency | n-gram |
---|---|---|
1 | 1339 | -nyň |
2 | 1281 | -lar |
3 | 958 | -dan |
4 | 840 | -ler |
5 | 802 | -yny |

For N=4:

rank | frequency | n-gram |
---|---|---|
1 | 746 | -ynyň |
2 | 627 | -lary |
3 | 569 | -aryň |
4 | 528 | -iniň |
5 | 478 | -ynda |

For N=5:

rank | frequency | n-gram |
---|---|---|
1 | 528 | -laryň |
2 | 327 | -leriň |
3 | 273 | -rynyň |
4 | 256 | -yndan |
5 | 246 | -aryny |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings; the aim here is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos := @pos + 1 AS pos, xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
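
The same counting can be sketched outside the database. The following Python snippet is only an illustration, not the query used for the tables: the function name `top_endings` and the toy word list are hypothetical, and it assumes the input is a list of distinct word forms, one per row of `words`. Unlike `RIGHT(word, N)`, which returns the whole word when it is shorter than N, this sketch skips such words.

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent word endings.

    `words` is assumed to be an iterable of distinct word forms.
    """
    # Count the last n letters of every word that has at least n letters.
    counts = Counter(w[-n:] for w in words if len(w) >= n)
    # Sort by descending frequency; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
    # Prefix each ending with "-" as in the tables.
    return [(rank, freq, "-" + gram)
            for rank, (gram, freq) in enumerate(ranked, start=1)]


# Hypothetical toy word list, for illustration only.
sample = ["kitaplar", "adamlar", "öýler", "kitabyň", "adamyň", "öýden"]
for n in range(1, 6):
    print(n, top_endings(sample, n, limit=3))
```

Calling `top_endings(word_list, 3)` on the full word list would correspond to the N=3 query, up to the handling of short words and of ties.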
2.2.5 Most frequent word beginnings